Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees

https://doi.org/10.1287/opre.2022.0511

Zeng, Siliang; Hong, Mingyi; Garcia, Alfredo (September 2024, Operations Research)

Researchers have introduced a new algorithm for estimating structural models of dynamic decisions by human agents, a task with high computational complexity. Traditionally, estimation has a nested structure: an inner problem that identifies an optimal policy and an outer problem that maximizes a measure of fit. Previous methods have struggled with large discrete state spaces or high-dimensional continuous state spaces, often at the cost of reward estimation accuracy. The new approach is a single-loop algorithm in which each policy improvement step is followed by a stochastic gradient step for likelihood maximization, so it can handle high-dimensional state spaces without compromising reward estimation accuracy. The algorithm converges to a stationary solution with a finite-time guarantee. When the reward is linearly parameterized, it approximates the maximum likelihood estimator at a sublinear rate.
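
To make the single-loop idea concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a small tabular MDP with known transitions, an entropy-regularized (soft) policy, a linear reward `phi @ theta`, and expert data simulated from a known expert policy in place of a real dataset. All function and parameter names (`soft_backup`, `rollout_features`, `single_loop_estimate`, `lr`, `horizon`) are hypothetical. Each iteration performs one policy improvement step and then one stochastic gradient step on the reward parameter, rather than solving for an optimal policy inside every outer update.

```python
import numpy as np


def soft_backup(Q, reward, P, gamma):
    """One soft Bellman backup, used here as the policy improvement step."""
    m = Q.max(axis=1)
    # Numerically stable log-sum-exp over actions gives the soft value V(s).
    V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))
    # P has shape (S, A, S) and V has shape (S,), so P @ V has shape (S, A).
    return reward + gamma * (P @ V)


def softmax_policy(Q):
    """Soft policy pi(a|s) proportional to exp(Q(s, a))."""
    z = np.exp(Q - Q.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)


def rollout_features(policy, P, phi, gamma, horizon, rng):
    """Discounted feature sum along one sampled trajectory (uniform start)."""
    S, A, _ = P.shape
    s = rng.integers(S)
    total = np.zeros(phi.shape[-1])
    for t in range(horizon):
        a = rng.choice(A, p=policy[s])
        total += gamma**t * phi[s, a]
        s = rng.choice(S, p=P[s, a])
    return total


def single_loop_estimate(expert_policy, P, phi, gamma=0.9, iters=200,
                         lr=0.05, horizon=20, seed=0):
    """Alternate one policy improvement step with one stochastic gradient step.

    Assumption: expert trajectories are simulated from expert_policy; in
    practice they would be drawn from observed data.
    """
    rng = np.random.default_rng(seed)
    S, A, d = phi.shape
    theta = np.zeros(d)
    Q = np.zeros((S, A))
    for _ in range(iters):
        # Policy improvement: a single soft backup under the current reward.
        Q = soft_backup(Q, phi @ theta, P, gamma)
        policy = softmax_policy(Q)
        # Stochastic gradient of the log-likelihood for a linear reward:
        # expert feature sum minus feature sum under the current policy.
        f_expert = rollout_features(expert_policy, P, phi, gamma, horizon, rng)
        f_agent = rollout_features(policy, P, phi, gamma, horizon, rng)
        theta += lr * (f_expert - f_agent)
    return theta, softmax_policy(Q)
```

Under these assumptions, calling `single_loop_estimate(expert_policy, P, phi)` with `P` of shape `(S, A, S)` and `phi` of shape `(S, A, d)` returns a reward parameter estimate and the corresponding soft policy. The tabular setting and single-sample gradient are simplifications chosen for readability; a high-dimensional setting would replace the table `Q` with a function approximator.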
